information need
JIR-Arena: The First Benchmark Dataset for Just-in-time Information Recommendation
Yang, Ke, Ros, Kevin, Kumar, Shankar Kumar Senthil, Zhai, ChengXiang
Just-in-time Information Recommendation (JIR) is a service designed to deliver the most relevant information precisely when users need it, addressing their knowledge gaps with minimal effort and boosting decision-making and efficiency in daily life. Advances in device-efficient deployment of foundation models and the growing use of intelligent wearable devices have made always-on JIR assistants feasible. However, there has been no systematic effort to formally define JIR tasks or establish evaluation frameworks. To bridge this gap, we present the first mathematical definition of JIR tasks and associated evaluation metrics. Additionally, we introduce JIR-Arena, a multimodal benchmark dataset featuring diverse, information-request-intensive scenarios to evaluate JIR systems across critical dimensions: i) accurately inferring user information needs, ii) delivering timely and relevant recommendations, and iii) avoiding irrelevant content that may distract users. Developing a JIR benchmark dataset poses challenges due to subjectivity in estimating user information needs and uncontrollable system variables affecting reproducibility. To address these, JIR-Arena: i) combines input from multiple humans and large AI models to approximate information need distributions; ii) assesses JIR quality through information retrieval outcomes using static knowledge base snapshots; and iii) employs a multi-turn, multi-entity validation framework to improve objectivity and generality. Furthermore, we implement a baseline JIR system capable of processing real-time information streams aligned with user inputs. Our evaluation of this baseline system on JIR-Arena indicates that while foundation model-based JIR systems simulate user needs with reasonable precision, they face challenges in recall and effective content retrieval. To support future research in this new area, we fully release our code and data.
- North America > United States > Massachusetts (0.04)
- North America > United States > Illinois > Champaign County > Urbana (0.04)
- Europe > Monaco (0.04)
- Asia > Middle East > Jordan (0.04)
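The evaluation dimensions named in the abstract (timely, relevant recommendations matched against an approximated information-need distribution, while penalizing irrelevant content) can be illustrated with a minimal sketch. All names here (`Recommendation`, `needs`, the item ids and time windows) are hypothetical placeholders for illustration, not the benchmark's actual schema or metrics:

```python
from dataclasses import dataclass

@dataclass
class Recommendation:
    item_id: str
    time: float  # seconds into the session when the item was shown

# Hypothetical ground truth: each information need maps to a set of
# relevant item ids and the time window [t0, t1] during which it is active.
needs = {
    "need_1": ({"doc_a", "doc_b"}, (10.0, 60.0)),
    "need_2": ({"doc_c"}, (90.0, 120.0)),
}

def evaluate(recs, needs):
    """Precision: fraction of recommendations that hit some active need.
    Recall: fraction of needs covered by at least one timely, relevant item."""
    hits = 0
    covered = set()
    for r in recs:
        for name, (relevant, (t0, t1)) in needs.items():
            if r.item_id in relevant and t0 <= r.time <= t1:
                hits += 1
                covered.add(name)
                break
    precision = hits / len(recs) if recs else 0.0
    recall = len(covered) / len(needs) if needs else 0.0
    return precision, recall

recs = [Recommendation("doc_a", 15.0), Recommendation("doc_x", 30.0)]
print(evaluate(recs, needs))  # (0.5, 0.5): one timely hit, one need covered
```

Precision here captures the "avoid distracting content" dimension, and recall the "cover the user's actual needs" dimension; the abstract's finding (reasonable precision, weaker recall) corresponds to a system that rarely shows irrelevant items but misses many active needs.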
Personalized Jargon Identification for Enhanced Interdisciplinary Communication
Guo, Yue, Chang, Joseph Chee, Antoniak, Maria, Bransom, Erin, Cohen, Trevor, Wang, Lucy Lu, August, Tal
Scientific jargon can impede researchers when they read materials from other domains. Current methods of jargon identification mainly use corpus-level familiarity indicators (e.g., Simple Wikipedia represents plain language). However, researchers' familiarity with a term can vary greatly based on their own background. We collect a dataset of over 10K term familiarity annotations from 11 computer science researchers for terms drawn from 100 paper abstracts. Analysis of this data reveals that jargon familiarity and information needs vary widely across annotators, even within the same sub-domain (e.g., NLP). We investigate features representing individual, sub-domain, and domain knowledge to predict individual jargon familiarity. We compare supervised and prompt-based approaches, finding that prompt-based methods including personal publications yield the highest accuracy, though zero-shot prompting provides a strong baseline. This research offers insight into features and methods to integrate personal data into scientific jargon identification.
- Education (0.93)
- Health & Medicine > Therapeutic Area (0.69)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.90)
- Information Technology > Communications (0.88)
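One feature the abstract highlights is a researcher's own publications as a signal of term familiarity. As a toy illustration of that idea only, here is a sketch; the function, threshold, and labels are invented for illustration and are not the paper's supervised or prompt-based method:

```python
from collections import Counter

def familiarity(term, publication_text, threshold=2):
    """Label a term 'familiar' if the researcher has used it at least
    `threshold` times in their own publications (a crude personal signal,
    in contrast to corpus-level indicators like Simple Wikipedia)."""
    counts = Counter(publication_text.lower().split())
    return "familiar" if counts[term.lower()] >= threshold else "unfamiliar"

# Hypothetical snippet from a researcher's past papers
pubs = "we fine-tune a transformer transformer model with dropout"
print(familiarity("transformer", pubs))  # familiar (appears twice)
print(familiarity("ablation", pubs))     # unfamiliar
```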
Evaluation of GPT-3.5 and GPT-4 for supporting real-world information needs in healthcare delivery
Dash, Debadutta, Thapa, Rahul, Banda, Juan M., Swaminathan, Akshay, Cheatham, Morgan, Kashyap, Mehr, Kotecha, Nikesh, Chen, Jonathan H., Gombar, Saurabh, Downing, Lance, Pedreira, Rachel, Goh, Ethan, Arnaout, Angel, Morris, Garret Kenn, Magon, Honor, Lungren, Matthew P, Horvitz, Eric, Shah, Nigam H.
Despite growing interest in using large language models (LLMs) in healthcare, current explorations do not assess the real-world utility and safety of LLMs in clinical settings. Our objective was to determine whether two LLMs can serve information needs submitted by physicians as questions to an informatics consultation service in a safe and concordant manner. Sixty-six questions from an informatics consult service were submitted to GPT-3.5 and GPT-4 via simple prompts. Twelve physicians assessed the LLM responses for the possibility of patient harm and concordance with existing reports from an informatics consultation service. Physician assessments were summarized based on majority vote. For no question did a majority of physicians deem either LLM response harmful. For GPT-3.5, responses to 8 questions were concordant with the informatics consult report, 20 were discordant, and 9 could not be assessed. There were 29 responses with no majority on "Agree", "Disagree", or "Unable to assess". For GPT-4, responses to 13 questions were concordant, 15 were discordant, and 3 could not be assessed. There were 35 responses with no majority. Responses from both LLMs were largely devoid of overt harm, but less than 20% of the responses agreed with an answer from an informatics consultation service, responses contained hallucinated references, and physicians were divided on what constitutes harm. These results suggest that while general-purpose LLMs are able to provide safe and credible responses, they often do not meet the specific information need of a given question. A definitive evaluation of the usefulness of LLMs in healthcare settings will likely require additional research on prompt engineering, calibration, and custom-tailoring of general-purpose models.
- North America > United States > California > San Francisco County > San Francisco (0.28)
- Europe > United Kingdom (0.14)
- North America > United States > California > Santa Clara County > Palo Alto (0.14)
- (4 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study > Negative Result (0.46)
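The majority-vote summarization described in the abstract (a label counts only if more than half of the twelve assessors agree, otherwise "no majority") can be sketched as follows; the vote counts below are hypothetical examples, not the study's data:

```python
from collections import Counter

def summarize(votes):
    """Return the label chosen by a strict majority of assessors,
    or None when no single label exceeds half the votes."""
    counts = Counter(votes)
    label, n = counts.most_common(1)[0]
    return label if n > len(votes) / 2 else None

# Twelve hypothetical physician ratings for one LLM response
votes = ["Agree"] * 7 + ["Disagree"] * 3 + ["Unable to assess"] * 2
print(summarize(votes))  # Agree (7 of 12 is a strict majority)

votes_split = ["Agree"] * 5 + ["Disagree"] * 5 + ["Unable to assess"] * 2
print(summarize(votes_split))  # None: no majority, like the 29 such GPT-3.5 responses
```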
A Prompt Log Analysis of Text-to-Image Generation Systems
Xie, Yutong, Pan, Zhaoying, Ma, Jinge, Jie, Luo, Mei, Qiaozhu
Recent developments in large language models (LLMs) and generative AI have unleashed the astonishing capabilities of text-to-image generation systems to synthesize high-quality images that are faithful to a given reference text, known as a "prompt". These systems have immediately received substantial attention from researchers, creators, and everyday users. Despite the many efforts to improve the generative models, there is limited work on understanding the information needs of the users of these systems at scale. We conduct the first comprehensive analysis of large-scale prompt logs collected from multiple text-to-image generation systems. Our work is analogous to analyzing the query logs of Web search engines, a line of work that has made critical contributions to the success of the Web search industry and research. Compared with Web search queries, text-to-image prompts are significantly longer, are often organized into special structures that consist of the subject, form, and intent of the generation task, and present unique categories of information needs. Users make more edits within creation sessions, which reveal remarkable exploratory patterns. There is also a considerable gap between user-input prompts and the captions of the images included in the open training data of the generative models. Our findings provide concrete implications for how to improve text-to-image generation systems for creation purposes.
- North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- (5 more...)
- Information Technology > Information Management > Search (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.67)
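Two of the log statistics the abstract mentions (prompt length, and edits within creation sessions) can be sketched with a few lines of code. The log records here are invented examples, not the paper's data, and "edit" is simplified to mean any re-prompt within a session:

```python
from collections import defaultdict

# Hypothetical prompt-log records: (session_id, prompt text)
log = [
    (1, "a castle"),
    (1, "a castle on a hill, oil painting, golden hour"),
    (1, "a castle on a hill, oil painting, golden hour, highly detailed"),
    (2, "portrait of a cat"),
]

def mean_length(prompts):
    """Average prompt length in whitespace-separated tokens."""
    lengths = [len(p.split()) for p in prompts]
    return sum(lengths) / len(lengths)

def edits_per_session(log):
    """Count re-prompts (prompts beyond the first) in each session."""
    counts = defaultdict(int)
    for sid, _ in log:
        counts[sid] += 1
    return {sid: n - 1 for sid, n in counts.items()}

print(mean_length([p for _, p in log]))  # 6.5 tokens on average
print(edits_per_session(log))            # {1: 2, 2: 0}
```

Session 1 shows the exploratory pattern the abstract describes: the user iteratively extends the prompt with form and intent modifiers.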
Toward a Neural Semantic Parsing System for EHR Question Answering
Clinical semantic parsing (SP) is an important step toward identifying the exact information need (as a machine-understandable logical form) from a natural language query aimed at retrieving information from electronic health records (EHRs). Current approaches to clinical SP are largely based on traditional machine learning and require hand-building a lexicon. The recent advancements in neural SP show a promise for building a robust and flexible semantic parser without much human effort. Thus, in this paper, we aim to systematically assess the performance of two such neural SP models for EHR question answering (QA). We found that the performance of these advanced neural models on two clinical SP datasets is promising given their ease of application and generalizability. Our error analysis surfaces the common types of errors made by these models and has the potential to inform future research into improving the performance of neural SP models for EHR QA.
Forecasting User Interests Through Topic Tag Predictions in Online Health Communities
Adishesha, Amogh Subbakrishna, Jakielaszek, Lily, Azhar, Fariha, Zhang, Peixuan, Honavar, Vasant, Ma, Fenglong, Belani, Chandra, Mitra, Prasenjit, Huang, Sharon Xiaolei
The increasing reliance on online communities for healthcare information by patients and caregivers has led to an increase in the spread of misinformation: subjective, anecdotal, inaccurate, or non-specific recommendations that, if acted on, could cause serious harm to patients. Hence, there is an urgent need to connect users with accurate and tailored health information in a timely manner to prevent such harm. This paper proposes an innovative approach to suggesting reliable information to participants in online communities as they move through different stages of their disease or treatment. We hypothesize that patients with similar histories of disease progression or courses of treatment will have similar information needs at comparable stages. Specifically, we pose the problem of predicting topic tags, or keywords, that describe the future information needs of users based on their profiles, traces of their online interactions within the community (past posts and replies), and the profiles and interaction traces of other users similar to the target users. The result is a variant of collaborative information filtering, or a recommendation system, tailored to the needs of users of online health communities. We report results of our experiments on an expert-curated dataset, which demonstrate the superiority of the proposed approach over state-of-the-art baselines with respect to accurate and timely prediction of topic tags (and hence information sources of interest).
- Research Report > Promising Solution (0.48)
- Research Report > New Finding (0.46)
- Overview > Innovation (0.34)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Consumer Health (1.00)
- Health & Medicine > Epidemiology (0.88)
- Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (0.69)
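The collaborative-filtering idea in the abstract (borrow the future information needs of users with similar interaction histories) can be sketched minimally. The profiles, tags, and cosine-similarity choice below are illustrative assumptions, not the paper's model:

```python
import math
from collections import Counter

# Hypothetical interaction profiles: word counts from each user's past posts
profiles = {
    "u1": Counter({"chemo": 3, "fatigue": 2, "diet": 1}),
    "u2": Counter({"chemo": 2, "fatigue": 1, "nausea": 2}),
    "u3": Counter({"surgery": 4, "recovery": 2}),
}
# Topic tags each user went on to need (hypothetical training signal)
future_tags = {"u2": {"side-effects", "nutrition"}, "u3": {"physiotherapy"}}

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def predict_tags(user, k=1):
    """Borrow the future tags of the k most similar users."""
    sims = sorted(
        ((cosine(profiles[user], profiles[u]), u)
         for u in profiles if u != user and u in future_tags),
        reverse=True,
    )
    tags = set()
    for _, u in sims[:k]:
        tags |= future_tags[u]
    return tags

print(predict_tags("u1"))  # u2 is the nearest neighbor, so its future tags are suggested
```

A user discussing chemotherapy and fatigue is matched to the similar user u2 rather than the surgery-focused u3, so the side-effects and nutrition tags are predicted as upcoming needs.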
InformationWeek, serving the information needs of the Business Technology Community
Most managers feel euphoria when implementing a technology meant to enhance the workflow of a team or an organization. But they often overlook the details that help implement the technology successfully. The same sentiment can occur for managers who oversee data scientists, data engineers, and analysts examining machine learning initiatives. Every organization seems to be in love with machine learning. Because love is blind, so to speak, IT teams become the first line of defense in protecting that euphoric feeling.
InformationWeek, serving the information needs of the Business Technology Community
As COVID-19 vaccination rates rise, conversations about the future of work are picking up again. It's no longer the workplace of 2019; the landscape has changed significantly since then. The automated, digitized world of work that we knew would arrive "soon" is suddenly here, and many of those changes are here to stay. Chief information officers and IT leaders have a key role to play in facilitating employee adoption and encouraging buy-in for the future of an AI-enabled workforce. Organizations accelerated their digital transformation plans over the past year, or improvised along the way, to accommodate the rapid shift to virtual work.
InformationWeek, serving the information needs of the Business Technology Community
AI is seeping into just about everything, from consumer products to industrial equipment. As enterprises use AI to become more competitive, more of them are taking advantage of machine learning to accomplish more in less time, reduce costs, and discover something new, whether a drug or a latent market desire. While there's no need for non-data scientists to understand how machine learning (ML) works, they should understand enough to use basic terminology correctly. Although the scope of ML extends well beyond what can be covered in this short article, the following are some of the fundamentals. Before one can grasp machine learning concepts, one needs to understand what machine learning terms mean.
A Methodology for Creating AI FactSheets
Richards, John, Piorkowski, David, Hind, Michael, Houde, Stephanie, Mojsilović, Aleksandra
As AI models and services are used in a growing number of high-stakes areas, a consensus is forming around the need for a clearer record of how these models and services are developed to increase trust. Several proposals for higher-quality and more consistent AI documentation have emerged to address ethical and legal concerns and the general social impacts of such systems. However, there is little published work on how to create this documentation. This is the first work to describe a methodology for creating the form of AI documentation we call FactSheets. We have used this methodology to create useful FactSheets for nearly two dozen models. This paper describes this methodology and shares the insights we have gathered. Within each step of the methodology, we describe the issues to consider and the questions to explore with the relevant people in an organization who will be creating and consuming the AI facts in a FactSheet. This methodology will accelerate the broader adoption of transparent AI documentation.
- North America > United States > New York > New York County > New York City (0.04)
- Europe > Sweden > Stockholm > Stockholm (0.04)
- Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
- Information Technology > Security & Privacy (1.00)
- Government (1.00)
- Law (0.89)